(57) ABSTRACT

A system for retrieving multimedia information is provided 
using a computer coupled to a computer-based network, 
such as the Internet, and particularly the World Wide Web 
(WWW). The system includes a web browser, a graphic user 
interface enabled through the web browser to allow a user to 
input a query representing the information the user wishes to 
retrieve, and an agent server for producing, training, and 
evolving first agents and second agents. Each of the first 
agents retrieves documents (Web pages) from the network at
a different first network address and at other addresses linked 
from the document at the first network address. Each of the 
second agents executes a search on different search engines 
on the network in accordance with the query to retrieve 
documents at network addresses provided by the search 
engine. The system includes a natural language processor 
which determines the subject categories and important terms 
of the query, and of the text of each agent retrieved document.
The agent server generates and trains an artificial
neural network in accordance with the natural language 
processed query, and embeds the trained artificial neural 
network in each of the first and second agents. During the 
search, the first and second agents process through their 
artificial neural network the subject categories and important 
terms of each document they retrieve to determine a retrieval 
value for the document. The graphic user interface displays 
to the user the addresses of the retrieved documents which 
are above a threshold retrieval value. The user manually, or 
the agent server automatically, selects which of the retrieved 
documents are relevant. Periodically, the artificial neural 
network of the first and second agents is expanded and 
retrained by the agent server in accordance with the selected 
relevant documents to improve their ability to retrieve 
documents which may be relevant to the query. Further, the 
agent server can evolve an artificial neural network based on 
the current artificial neural network, the retrieved 
documents, and their selected relevancy, by iteratively
producing, training, and testing several generations of neural 
networks to produce an evolved agent. The artificial neural 
network of the evolved agent then replaces the current 
artificial neural network used by the agents to search the 
Internet. One or more concurrent searches of the Internet may
be provided.

24 Claims, 8 Drawing Sheets 
The U.S. Government has rights in this invention pursuant
to grant MNA202-97-1-1025 between the National
Imagery Mapping Agency and Syracuse University. 

FIELD OF THE INVENTION

The present invention relates to a system (and method) for 
retrieving multimedia information from a computer-based 
network, such as the Internet, using multiple evolving intel- 
ligent agents, and relates particularly to a system for retriev- 
ing information, in terms of documents or Web pages, at 
network addresses using agents for crawling through the 
Internet and executing searches on search engines on the 
Internet to retrieve documents, in accordance with a user 
inputted query. The system is suitable for a user at a 
computer coupled to the Internet to automatically retrieve 
Web pages from the Internet in accordance with a natural 
language query. 

BACKGROUND OF THE INVENTION 

The Internet is a worldwide network of computers with a
multitude of sites providing a vast amount of information. A
major part of the Internet is called the World Wide Web
(WWW). It represents the sites on the Internet which operate
in accordance with hypertext transfer protocol (HTTP),
commonly called Web sites. To access information on the
WWW, a Web browser operating on a computer coupled to
the Internet allows a user access to, and the ability to
receive Web pages from, the WWW. Each Web page
represents a document formatted in a Hypertext Markup
Language (HTML) which directs the Web browser on how to
display the text, graphics, and hyperlinks of the Web page.
Hyperlinks represent graphical regions of a Web page which
when selected by a user direct the Web browser to the
addresses of other Web pages.

The Web sites may be considered as representing numerous
on-line resources. At present, productive use of such
on-line resources to the computer user is hampered by the
huge amount of information present on the WWW. An
excessive amount of time is required to locate useful data,
and the dynamic and transient nature of such on-line data
often means that information is lost, overlooked or quickly
outdated. The result is that on-line users often spend more
time searching for information than actually using it.
Traditional solutions to this problem include online indexes.
Online indexes are usually included in popular search
engines on the Internet, such as AltaVista or Lycos. A user
can access the site of a search engine and input a query, and
then receive a list of addresses of Web pages which could be
relevant to the query. The databases of indexes are
continually updated, but generally offer only a first-level
filter on information, thus requiring users to search manually
for relevant data. Furthermore, due to the great number of Web
sites having Web pages, such indexes often include 35% or
less of the number of Web pages available on the WWW. An
index/retrieval system having a search engine is described,
for example, in U.S. Pat. No. 5,748,954.

To build the individual entries on the indexes of Web
search engines, software robots or agents are often used to
search individual Web pages along the Internet to locate Web
pages to include in their index. The software robots are
typically called Web crawlers, wanderers or spiders, since they
continuously search Web pages linked to other Web pages.
The process of crawling the WWW is slow and time-consuming
due to the expansive number of sites on the
Internet, and includes rules which necessarily limit the
number of terms to be used. Web crawlers for on-line
indexes have very limited intelligence, and are focused on
identifying search terms to be used in the index to be
cross-referenced to Web pages. Moreover, although companies
providing Web search engines may use Web crawlers to
develop their indexes, a typical computer user does not have
access to Web crawlers, and must rely on querying search
engines on the Internet to locate Web pages potentially
relevant to their needs.

Other approaches for locating information on the Internet 
include directories and catalogs. Online directories, such as 
Web-based Yahoo, compile information on popular topics or
areas with human aid, but are highly subjective and often too 
general for many information seekers. Online catalogs are 
lists through which a user can scroll and select a Web page 
of interest to review. Such online catalogs are also compiled 
with human assistance but have no associated search 
engines. 

Web-based intelligent agents with neural networks have
been developed to search the Internet. For example, Autonomy
Inc. of the United Kingdom has developed Agentware
software which uses agents, neural networks and pattern
matching to identify Web pages to provide categorization
and cross-referencing of digital information. However, such
Web-based intelligent agent technology often requires
constant supervision for operation. Queries to be used by agents
are stated in simplistic abbreviated form. Further, such
agents either do not learn or rely on a single machine learning
mechanism, and often are limited to queries of text-based
tasks. They are unable to initiate actions autonomously or
operate autonomously. These agents further do not evolve
into new agents which can potentially improve the ability to
classify Web pages without user intervention, and their
ability to be trained by user feedback or other knowledge
inputs is highly circumscribed. Web agents with the ability
to learn are described, for example, in L. Chen & K. Sycara,
1998, "WebMate: A personal agent for browsing and
searching," Proceedings of Autonomous Agents 98, pp.
132-138; T. Joachims, D. Freitag & T. Mitchell, 1998,
"WebWatcher: A tour guide for the World Wide Web,"
Proceedings of IJCAI 97; and M. Pazzani, J. Muramatzu,
D. Billsus, 1996, "Syskill & Webert: identifying interesting
Web sites," Proceedings of AAAI conference.

Some existing Web agent systems can deploy multiple 
agents for the same core query, as provided by the MetaBot 
search engine, but there is usually no inter-agent commu- 
nication or inter-agent learning. Multiple Web agents are 
used only as a means of speeding the recovery of data, not 
as a means of improving the retrieval performance of the 
system. 

To facilitate searching the WWW for information, meta- 
searching programs have been developed to query multiple 
Web search engines and combine the results of the searches. 
This can provide a more complete search of the WWW than 
can be provided by any single Web search engine. The 
company Agent Technologies Inc. has developed software
called Copernic98Plus having the capability to search
multiple content-specific sites and to simultaneously search
more than a hundred search engines using smart agents.
Meta-searching programs however are limited to operating 
on the results of searches from Web search engines and do 
not utilize Web crawling to locate documents. 

It is thus desirable to provide a system which allows a user
at their computer to retrieve desired information on the
WWW by combining the search capability of Web crawling
with the meta-searching of multiple Web search engines using
agents which learn and evolve as the search progresses.

SUMMARY OF THE INVENTION

Accordingly, it is the principal object of the present
invention to provide a system for retrieving information
from the Internet, and particularly the WWW, using multiple
intelligent agents, which can more efficiently retrieve
documents than prior art Web agent systems by integrating both
meta-searching and crawler agents.

It is another object of the present invention to provide a
system for retrieving documents using multiple agents
which are adaptive in the capability to learn from the user
and the experience of other agents, evolve as a group, and
operate cooperatively to retrieve the desired information.

It is still another object of the present invention to provide
a system for retrieving information from the Internet using
multiple agents each having a common neural network in
which the relevancy of documents, i.e., Web pages, retrieved
by agents is determined either by the user, or automatically by
the system, for expanding, training, and evolving the neural
network of such agents.

Yet another object of the present invention is to provide a
system for retrieving documents which operates autonomously
on behalf of the user to retrieve desired information.

It is a further object of the present invention to provide a
system for retrieving information from the Internet using
multiple intelligent agents and natural language processing
of the query for building the artificial neural network for the
agents, and natural language processing of the retrieved
documents to be applied to the artificial neural network of
agents.

A still further object of the present invention is to provide
a system for retrieving information from the Internet using
multiple agents in which the information received can be of
one or more different media types.

Briefly described, the present invention embodies a system
for retrieving information on a computer coupled to a
computer-based network, such as the Internet, in accordance
with a query. The system includes a Web browser and a
graphic user interface through which the Web browser
enables a user to input information defining a user search
profile, including a natural language query, the media type of
document desired, and any starting network addresses. The
system further includes an agent server for producing
multiple crawler agents and meta-search agents under an agent
leader associated with the user profile. The agent server
stores records in a database, via a database server, defining
the user profile for the agent leader and other information,
including the search results. Each crawler agent retrieves
documents from the network at a different starting network
address and at other addresses linked from the document at
the starting network address, and so on. Each meta-search
agent executes a search on different search engines
addressable on the network in accordance with the query to retrieve
documents at network addresses provided by the search
engine. A natural language processor enables the agent
server to determine the subject categories and important
terms of the query, and determines the subject categories and
important terms of the text of each agent retrieved document.
The agent server uses the subject categories and
important terms from the natural language processed query
to establish an initial set of inputs for a neural network, trains
this neural network in accordance with test patterns based on
the natural language processed query, and then embeds the
neural network in each of the crawler and meta-search
agents. During the search, when each of the crawler or
meta-search agents retrieves a document, the neural network
of that agent processes the document's associated subject
categories and important terms from the natural language
processor to determine a retrieval value for the document.
For each retrieved document, its network address and
retrieval value are stored in a database. The agent server
displays to the user, via the graphic user interface, the
addresses of the retrieved documents which have a retrieval
value above a threshold level. The user may use the Web
browser to review the retrieved documents. The user can
select which of the retrieved documents are relevant by
reviewing the documents at their associated network
addresses, or the agent server automatically selects a certain
number of the documents having the highest retrieval values
as relevant. The relevancy of documents is recorded in the
database.

To enable the agents to learn, the agent server periodically
adds inputs to the neural network of the crawler and
meta-search agents in accordance with selected relevant
documents based on the frequency of the associated subject
categories and important terms provided by the natural language
processor, and then retrains the neural network using test
patterns based on the subject categories and important terms of
retrieved documents.

To enable the agents to evolve, the agent server randomly
produces a first generation of agents each having a neural
network with a different subset of the inputs (i.e., subject
categories and important terms) of the current neural network
used by the crawler and meta-search agents. Each of
this generation of agents' neural networks is first trained
using a group of the retrieved documents and then tested on
their accuracy (fitness) in predicting the relevancy of another
different group of the retrieved documents. The next
generation of new neural networks is then produced having
inputs again having a different subset of the inputs of the
neural network used by the crawler and meta-search agents,
including the inputs of the artificial neural network of agents
which provided better prediction of relevance and
non-relevance. Successive generations of neural networks are
produced, trained, and tested until a maximum number of
generations is produced, or the generations stabilize. The
neural network of the agent of the last generation with the best
prediction accuracy is then embedded in each of the
crawler and meta-search agents. Thus, agents both learn and
evolve, and inter-agent communication is provided through
the documents retrieved by all crawler and
meta-search agents for learning and evolving. Further,
information of the documents retrieved may be of one or
more media types, such as text, graphic, audio,
video, or any, as defined in the user profile.

During the search, one or more of the addresses of the
documents retrieved from the meta-search agents may
become a new starting address for a crawler agent to search
the WWW. The search continues until stopped by the user.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing objects, features and advantages of the
invention will become more apparent from a reading of the
following description in connection with the accompanying
drawings, in which:

FIG. 1 is a block diagram of the system in accordance
with the present invention;
FIGS. 2A and 2B are connected flow charts showing the
operation and programming of the system of FIG. 1;

FIGS. 3A and 3B are examples of the graphic user
interface of the system of FIG. 1;

FIG. 4 is a flow chart showing the operation and
programming for training the neural network embedded in
agents searching the WWW in the system of FIG. 1; and

FIG. 5 is a flow chart showing the operation and
programming for evolving the neural network embedded in
agents searching the WWW in the system of FIG. 1.

DETAILED DESCRIPTION OF THE INVENTION

Referring to FIG. 1, the system 8 of the present invention
is shown having a computer system 10 coupled to a display
12 and a user interface 14, such as a keyboard and mouse.
Computer system 10 represents a typical desktop personal
computer, lap-top computer, or workstation of a user.
Computer system 10 is coupled to the Internet, and particularly
the World Wide Web (referred to herein as WWW or the
Web) 15, via a network interface 16, such as a modem, LAN,
or cable to an Internet Service Provider. Alternatively,
computer system 10 may be a network computer server
coupled to the Internet via a high-bandwidth Internet
connection, such as a shared T1 line. Other peripheral
devices, not shown, such as a printer or CDROM, may also
be coupled to computer system 10. The computer system 10
further includes a hard-disk drive 18 and memory (RAM) 19
for program and related data storage.

The following terminology will be used in this description.
The term agent refers to a software component which
functions continuously and autonomously along the WWW
and has artificial intelligence in the form of a neural network
to learn as it carries out retrieval tasks. The term document
refers to an HTML Web page retrieved by an agent at an
address on the Internet. Each document may have text,
graphics, and hyperlinks to other HTML Web pages, as
typical of HTML Web pages. The term address refers to a
Universal Resource Locator (URL) on the WWW of a
document retrievable from a Web site. The term query
represents text defining the information the user wishes to
retrieve in documents from the WWW. The term training
refers to the determination of weights for an artificial neural
network based on training patterns, and the term evolving
refers to the creation and training of new generations of
agents having artificial neural networks which can better
classify information than their parent agents.

The computer is programmed in accordance with software
providing the following components, which will be
described later in more detail: a Web browser 20, a graphic
user interface (GUI) 21, an agent server 22, a natural
language processor 24, and a database server 26 coupled to
a database 27. The Web browser 20 may be any typical Web
browser software, such as Microsoft Internet Explorer or
Netscape Navigator, to access sites on the network 15 via the
network interface 16. The GUI 21 is an HTML page (or
linked HTML pages) enabled through the Web browser 20
at a location (or file) on the hard-drive 18. GUI 21 defines
the screen or screens for enabling a user to input information
defining a user search profile (referred to hereinafter as user
profile), to view the results of an ongoing search, such as
addresses (URLs) of retrieved documents, to select which of
the documents are relevant, and to link to the address of
retrieved documents on the WWW through the Web browser
20. The information of a user profile includes at least a
query, but can define the starting addresses to be searched,
the media type of documents to be retrieved, such as text,
graphic, audio, video, or any, or documents at particular
domain types, such as .org, .com, or .gov. The GUI 21 may
be constructed from JAVA applets to build the windows to
input and display information from the agent server, and
buttons to execute functions. An example of the GUI 21 will
be described later in connection with FIGS. 3A and 3B.

The natural language processor 24 may be any natural
language processing means capable of analyzing text to
determine, at a minimum, the key terms associated with the
text. Preferably, the natural language processing is provided
as described in U.S. Pat. No. 5,873,056, or the subject
categories and important terms in the natural language
processing described in U.S. patent application Ser. No.
08/696,702, which are herein incorporated by reference.
Other articles describing the natural language processing are
"Document retrieval using linguistic
knowledge," Proceedings of RIAO '94 Conference, 1994,
and "Text categorization for multiple users
based on semantic information from a MRD," ACM
Transactions of Information Systems, July 1994. The
dictionary (or thesaurus or lexicon) described in the U.S. Pat.
No. 5,873,056, as well as the Military Handbook 850:
Glossary of Mapping, Charting, and Geodetic Terms, may be
stored in memory 19 of the system and used by the natural
language processor 24 to identify the subject categories and
important terms present in text.

The agent server 22 operates in accordance with the user
profile, received via the GUI 21, to generate multiple agents
28 embedded with a common trained artificial neural
network and sends such agents to access Web sites along the
Internet 15 and retrieve documents therefrom. The natural
language processor 24 is coupled to the agent server via the
Internet communication protocol TCP/IP to facilitate the
transmission of data to the natural language processor. The
natural language processor 24 is utilized by the agent server
22 to determine the subject categories and important terms
of the query of a user profile. Using this information, the
agent server 22 builds an artificial neural network and
generates an initial set of training patterns for the neural
network. The artificial neural network represents a typical
three level feed-forward artificial neural network having an
input layer, a hidden layer, and an output layer of artificial
neurons in which each path from one neuron to another has
a weight. The input layer represents input artificial neurons
in which one input is provided for each subject category and
important term from a natural language processed query.
The output layer consists of a single output neuron and the
hidden layer represents the artificial neurons between the
input and output layers. The agent server 22 trains the
artificial neural network to determine a retrieval status value
(called hereinafter retrieval value) based on the frequency
or absence of the subject categories and important terms of
the query as determined by a real number value between 0
and 1, respectively, at each input of the artificial neural
network. At the input of the artificial neural network, the real
value is the number of times the subject category or
important term associated with that input appeared in the natural
language processed text divided by the total number of times all
subject categories and important words appeared in the
natural language processed text. For example, if the natural
language processed query had three words, two being the same
subject category and the third an important term, then the input
associated with the subject category would be 0.67 (2/3) and
the input associated with the important term would be 0.33
(1/3).

Each document retrieved by an agent is sent by the agent
server 22 to the natural language processor 24 to obtain the
subject categories and important terms of the text within the
HTML file associated with the document. The agent server
22 returns this information to the agent which retrieved the
document, which sets each artificial input neuron based on
the frequency or absence of the subject categories and
important terms in the natural language processed text of the
document as determined by a real number value between 0 and
1. At the input of the artificial neural network, the real value
is the number of times the subject category or important term
associated with that input appears in the natural language
processed text of the document divided by the total number of
times all subject categories and important words of the query
(at all inputs) appear in the natural language processed text of
the document. If the category or term associated with an input
is not present in the natural language processed text of a
document, then that input is set to "0". The output neuron
provides a retrieval value in the range of 0 to 1 for the
document, where the higher the value the greater the proximity
(or match) of the content of the document to the natural
language processed query. As documents are retrieved, the agent
server 22 enables the agents to learn and neurogenically
evolve their artificial neural network based on agent
retrieved documents.
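
The input calculation described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the example terms, the sigmoid activation, and the absence of bias terms are not taken from this description, which specifies only a three level feed-forward network with weighted paths and a single output in the range of 0 to 1.

```python
import math
from collections import Counter


def input_values(query_terms, document_terms):
    """Return one value in [0, 1] per distinct query term.

    Each value is the number of times the term appears in the processed
    text divided by the total number of times all query terms appear in
    that text. Terms that are absent get 0.
    """
    inputs = list(dict.fromkeys(query_terms))  # distinct terms, order kept
    counts = Counter(t for t in document_terms if t in inputs)
    total = sum(counts.values())
    if total == 0:
        return {term: 0.0 for term in inputs}
    return {term: counts[term] / total for term in inputs}


def sigmoid(x):
    # Assumed activation; the description does not name one.
    return 1.0 / (1.0 + math.exp(-x))


def retrieval_value(values, hidden_weights, output_weights):
    """Forward pass: input layer -> hidden layer -> single output neuron."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, values)))
              for row in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


# Hypothetical processed query: one subject category seen twice, one term once.
query = ["geodesy", "geodesy", "map"]
values = input_values(query, query)
# values["geodesy"] is 2/3 and values["map"] is 1/3, as in the example above.
```

With arbitrary illustrative weights, `retrieval_value(list(values.values()), [[0.5, -0.2], [0.1, 0.3]], [1.0, -1.0])` yields a number strictly between 0 and 1; real weights would come from training.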

The agent server 22 can enable multiple searches of the
WWW under different user profile information to take place
concurrently or successively as the user directs, by providing
an agent leader 23 within the agent server 22 for each
user profile for creating, training and evolving multiple
agents. The programming and operation of the agent server
22 will best be described later in connection with the flow
charts of FIGS. 2A and 2B.
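
As a rough illustration of one agent leader per user profile running concurrently or successively, the following Python sketch may help. The thread pool, the class layout, and the placeholder `run_search` method are assumptions made for the example; the description does not state how concurrency is achieved.

```python
from concurrent.futures import ThreadPoolExecutor


class AgentLeader:
    """One leader per user profile (hypothetical minimal form)."""

    def __init__(self, profile_name, query):
        self.profile_name = profile_name
        self.query = query

    def run_search(self):
        # Placeholder for creating, training and evolving agents.
        return f"{self.profile_name}: searched for {self.query!r}"


def run_concurrently(leaders):
    # One worker per leader; at least one worker so an empty list is valid.
    with ThreadPoolExecutor(max_workers=max(1, len(leaders))) as pool:
        return list(pool.map(AgentLeader.run_search, leaders))


def run_successively(leaders):
    return [leader.run_search() for leader in leaders]
```

Both functions return results in the order of the leaders passed in, so the user's choice between concurrent and successive operation changes only the scheduling.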

The database 27 includes tables having linked records for
storing information for each search of the WWW. The
database server 26 represents software, such as Postgres,
Oracle, or Microsoft SQL Server, which updates (adds,
deletes, modifies) the records in the database in accordance
with transactions received from the agent server 22. The
database 27 contains for each user profile entered by a user
through the GUI 21 a record in an Agent Leader Table
having fields for storing information about the search: the
original query provided by a user; the subject categories and
important terms from the natural language processed query;
the starting addresses for Web crawling; search results
representing the addresses (URLs) of each retrieved
document, their retrieval value, a relevancy bit indicating
whether the document was selected as relevant, and an
optional unique document identifier assigned by the agent
server to the document; information defining the artificial
neural network including the inputs (i.e., number of input
neurons and their subject category or important term), the
hidden layer neurons, the output neuron, and weights of all
branches between neurons; and the user profile defining the type
of documents or other user preferences. Other data structures
may also be used to store the same information; for example,
a data field of a record in the Agent Leader Table may have
an identifier linked to stored records in other related tables.
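
The fields of an Agent Leader Table record listed above could be modeled, for illustration only, with the Python dataclasses below. All class and field names are hypothetical, and an actual embodiment would hold these fields in the database 27 rather than in memory.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchResult:
    address: str                       # URL of the retrieved document
    retrieval_value: float             # network output in the range 0 to 1
    relevant: bool = False             # relevancy bit
    document_id: Optional[int] = None  # optional unique document identifier


@dataclass
class NetworkDefinition:
    inputs: List[str]                  # one category or term per input neuron
    hidden_neurons: int
    weights: List[List[float]] = field(default_factory=list)


@dataclass
class AgentLeaderRecord:
    name: str                          # name identifying the user profile
    query: str                         # original query
    categories_and_terms: List[str]    # from the processed query
    starting_addresses: List[str]
    network: NetworkDefinition
    search_results: List[SearchResult] = field(default_factory=list)
    media_type: str = "any"            # user preference

    def relevant_addresses(self):
        return [r.address for r in self.search_results if r.relevant]
```

A linked-table layout, as the paragraph above allows, would replace the nested lists with identifiers pointing at records in related tables.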

The subject categories and important terms associated
with each retrieved document may be stored with the search
results in the record of the Agent Leader Table. However, the
database may further include a Processed Document Table
having records storing, for each document, the subject
categories and important terms of the natural language
processed text of the document. The records in the Processed
Document Table may be linked to the stored search results
in the Agent Leader Table by document identifiers. The
agent leader is capable of retrieving, adding, updating, and
removing records from the Agent Leader Table and records of
its related tables. Further, each subject category and
important term is associated in an Ontology Table stored in
database 27 with a unique code (or an identifier) used by the
system for internal processing purposes. For example, each
code may be a unique 32-bit number, and the output of the
natural language processor may actually be a series of codes
representing the subject categories and important terms of
the inputted text.
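
A minimal Python sketch of such an Ontology Table follows. Assigning codes sequentially in order of first appearance is an assumption made for the example; the description requires only that each code be unique and, in one example, fit in 32 bits.

```python
class OntologyTable:
    """Maps each subject category or important term to a unique code."""

    MAX_CODE = 2**32 - 1  # largest value of an unsigned 32-bit number

    def __init__(self):
        self._code_by_term = {}
        self._term_by_code = {}

    def code_for(self, term):
        # Assign the next free code the first time a term is seen.
        if term not in self._code_by_term:
            code = len(self._code_by_term)
            if code > self.MAX_CODE:
                raise OverflowError("ontology exceeds the 32-bit code space")
            self._code_by_term[term] = code
            self._term_by_code[code] = term
        return self._code_by_term[term]

    def term_for(self, code):
        return self._term_by_code[code]

    def encode(self, terms):
        """Turn processed text (a list of terms) into a series of codes."""
        return [self.code_for(t) for t in terms]
```

Repeated terms map to the same code, so the series of codes preserves the term frequencies that the input neurons depend on.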

The computer system 10 may further include a start-up
program, such as a batch file, which when executed by a user
executes the programs stored on the hard-drive for running
the components 19-26. The software of the agent server 22
may be programmed using the JAVA programming language
in combination with C++, which defines program elements in
terms of classes and objects. For example, in JAVA each
agent leader represents a class which enables agent program
objects to search the WWW. However, programming may be
in other programming languages.
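
The class and object relationship can be sketched as follows. Python is used here only for brevity, where the description names JAVA and C++, and the class layout, attribute names, and `create_agents` method are hypothetical.

```python
class Agent:
    """An agent program object created by an agent leader."""

    def __init__(self, kind, target, network):
        self.kind = kind        # "crawler" or "meta-search"
        self.target = target    # starting address or search engine
        self.network = network  # common network shared by all agents


class AgentLeader:
    """Creates the agents that search under one user profile."""

    def __init__(self, name, network):
        self.name = name
        self.network = network
        self.agents = []

    def create_agents(self, starting_addresses, search_engines):
        # One crawler agent per starting address.
        for address in starting_addresses:
            self.agents.append(Agent("crawler", address, self.network))
        # One meta-search agent per search engine.
        for engine in search_engines:
            self.agents.append(Agent("meta-search", engine, self.network))
        return self.agents
```

Every agent holds a reference to the same network object, which mirrors the statement that the agents are embedded with a common trained artificial neural network.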

Referring to FIGS. 2A and 2B, a flow chart of the 
operation and programming of the computer system 10, and 
particular agent server 22, is shown. First, a user accesses 
the GUI 21 through the Web browser 20 at address line 59 
and enters information of a user profile defining the desired 
search (step 30). One screen of the GUI 21 on display 12 
may be, for example, the page shown in FIG. 2 A. As in 
typical HTML pages, the selecting of buttons or drop down 
menu items on the GUI 21 is facilitated by clicking the 
mouse of the user interface 14 over the screen area associ- 
ated with the button or menu item. Each user profile is 
defined by a name to identify both the user profile and the 
associated agent leader. This name is inputted in data field 60 
by the user via the keyboard 14. The user then clicks on the 
create agent button 62a to establish the agent, and the setting 
button 62b to receive page 64 allowing the user to enter the 
information (setting) defining the user profile to search the 
WWW under the agent leader. This information includes the 
query (data field) 65, the search type 66 (i.e., a drop down 
menu to set to the type of multimedia information to be 
retrieved, such as graphic — gif files, audio, video, text, or 
any type), and any starting page addresses 68. The starting 
page addresses represent the addresses at which different 
crawler agents will start searching the WWW. The query 
may be, for example, up to 100 characters. The starting 
pages may be added by the user entering the address in data 
field 68 and then clicking on an add starting page button 70a. 
The starting page addresses will appear in the box 71 
representing the current starting pages. To remove a starting 
page, the user clicks on an address in box 71 until 
highlighted, and then on a remove starting page button 70b. 
The user is not required to enter any starting page addresses. 
A number may be entered in the automatic timeout data
field 69 representing the number of minutes the system will
wait to allow the user to manually select the relevant
documents retrieved before automatic relevance feedback is
performed, as will be described later.
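
The user profile settings described above can be sketched as a small data structure. This is only an illustrative sketch: the class name `UserProfile`, its field names, and the `MEDIA_TYPES` tuple are hypothetical and do not appear in the description; only the 100 character query limit and the listed media types are taken from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Media types named in the description of search type 66.
MEDIA_TYPES = ("any", "graphic", "audio", "video", "text")

@dataclass
class UserProfile:
    """Hypothetical container for the settings entered on page 64."""
    name: str                                   # identifies profile and agent leader
    query: str                                  # data field 65
    search_type: str = "any"                    # drop down menu 66
    starting_pages: List[str] = field(default_factory=list)  # box 71, may be empty
    auto_timeout_minutes: Optional[int] = None  # data field 69

    def __post_init__(self) -> None:
        # The description gives 100 characters as an example query limit.
        if len(self.query) > 100:
            raise ValueError("query longer than 100 characters")
        if self.search_type not in MEDIA_TYPES:
            raise ValueError(f"unknown search type: {self.search_type}")

profile = UserProfile(name="earth", query="I would like information about Earth")
```

Starting pages are optional in this sketch, matching the statement that the user is not required to enter any.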

If the user wishes, a Process Query button 76 may be 
selected, which directs the agent server 22 to send the query 
to the natural language processor and show the results to the 
user in data field 64, such that the user may review the 
results of the query prior to starting the search. To assist the 
user in selecting starting page addresses, the database may
store a table having records by subject categories listing 
recommended starting addresses associated with such sub- 
ject categories. If the Process Query button 76 is selected, 
the agent server 22 checks such records for any subject 
categories of the natural language processed query, and 
displays them through a recommended pages box 75 of the
GUI 21. The user may double click on any addresses
appearing in box 75 to add them to box 71, such as shown,
for example, in FIG. 3A. Additional information for the user
profile may also be added by data field or drop down menus,
such as the desired domain extension to be searched, and an
evolve time in terms of a time which when matching the
computer's clock 10 directs the agent server 22 to evolve the
agents. The evolve time may default to midnight. To start the
search in accordance with the user profile information
entered on page 64, the user clicks with an apply button 72,
otherwise, the user may click on button 74 to delete box 64
and any user inputted information therein.

With the user profile information entered, the agent server
22 receives and processes the information from the GUI 21
(step 32). If the query has not yet been processed by the
natural language processor, the agent server 22 sends the
query to the natural language processor 24, which processes
the text of the query and returns to the agent server the
subject categories and important terms of the query. The
agent server 22 then creates an agent leader for the user
profile (step 34) in which a record is created in the Agent
Leader Table of database 27, via the database server 26,
storing the original query, and the natural language
processed query, and other information received from page 64
of the GUI 21. In parallel with step 34, the agent server 22
initializes the artificial neural network using the processed
query from the natural language processor (step 36). To
achieve this, an input neuron is defined for each subject
category and important term, and an output neuron is defined
with a layer of neurons therebetween (generally equal to the
number of input neurons), where the weights of the
connecting branches between neurons are to be determined by
training. Two training patterns are created based on the
natural language processed query: one pattern indicating a
relevant document is present by each of the inputs being "1"
and the output "1", and the second pattern indicating the
absence of a relevant document by each of the inputs being
"0" and the outputs "0". A back-propagation learn algorithm
is used to determine the weights using the two training
patterns, as developed by Rumelhart, such as described in Y.
Chauvin & D. E. Rumelhart (eds), Backpropagation: theory,
architectures, and applications, Lawrence Erlbaum (1995).
Information defining the trained neural network is added to
record of database 27, via the database server 26. The user
through GUI 21 may manually instruct the agent server 22
to perform step 36 before clicking the apply button 72, such
as by first clicking on the process query button 76 and then
on a train neural button 77 (FIG. 3A).

Next, the agent leader of the agent server 22 for the user
profile generates a team of agents in which each agent is
embedded with the trained artificial neural network from
step 36 (step 38). For purposes of illustration, the agents
under an agent leader are denoted as 28 in FIG. 1. There are
two types of agents generated: crawler agents and
meta-search agents. Each of these agent types is capable of
connecting to a Web site at an address on the WWW through
the Web browser and thus establishes a session with that
Web site. Thus, when connected, an agent represents a
network client to Web site, i.e., the network server at that
site, and the document (HTML page) at that address can be
received by the agent at computer system 10 (FIG. 1). The
agents timeshare the Web browser's connection to the
Internet. However, if computer system 10 had a high bandwidth
Internet connection, multiple concurrent connections could
be established to the Internet.

The meta-search agents are sent to general purpose search
engines on the WWW (step 40) and specialized search
engines on the WWW (step 42). Each meta-search agent is
assigned to a different search engine on the WWW. General
search engines may be, for example, Lycos, AltaVista,
Yahoo, Snap, and others, while specialized search engines
may be those dedicated to a particular area of information,
for example, Getty Thesaurus of Geographical Names
(http://www.ship.getty.edu/tgn_browser/) or The Art &
Architecture Thesaurus Browser
(http://www.ship.getty.edu/aat_browser/). For each
meta-search agent, the agent server 22 converts the subject
categories and important terms of the natural language
processed query into a search query for input to the engine.
This is needed to account for differences in how searches
are formatted on different search engines. For example, if
the query is "I would like information about Earth", the
natural language processed query may consist of planet and
Earth, a subject category and an important term, and the
search query may be "query=planet+Earth", where "+"
indicates the boolean AND for the search engine. Each
meta-search agent connects to the search engine at its
address on the WWW, stored in a file in database 27, enters
the formatted query at the search engine's Web page, and
executes the search and retrieves the documents one at a
time from the results pages provided by the search engine.
This is achieved by the meta-search agent's capability to
recognize each of the URL addresses in the HTML code of
the results page of the search engine.

The crawler agents are sent directly to Web sites (step 44).
Each of the crawler agents can be sent to a first Web address
to retrieve the document at that address to the computer
system 10, and then proceed to retrieve other documents at
other Web addresses defined in hyperlinks of the document of
the first Web address, and so forth. No restrictions need be
placed on the number of levels of linked documents from the
document at the first Web address. If any crawler agent
locates multiple link addresses in a document, the address is
temporarily stored in a queue in memory 19 until the same
or another crawler agent is available to retrieve a document
from the WWW associated with that address. The agent
leader retrieves any starting addresses stored in the record
for the agent leader in the Agent Leader Table, and sends one
crawler agent to each of the starting addresses. The agent
leader continues to add to the start address list in the record
of the Agent Leader Table a predefined number of the top
URL addresses provided by the meta-search agents, such
that crawler agents can start crawling from such URL
addresses. This is indicated by arrows 41 and 43 from steps
40 and 42, respectively. For example, the first ten addresses
retrieved by each meta-search agent may be added to the
start address list.

The number of crawler agents is variable. The agent
leader can dynamically create new crawler agents and delete
old crawler agents, as needed within the available computing
resources of the computer system 10. The agent leader
can reuse existing crawler agents which have stopped
crawling due to all addresses linked to their starting address
having been retrieved. If the computer has insufficient
computing resources to create all the crawler agents or
meta-search agents, the agent leader waits until such
resources become available to send the agent to the WWW.
If a crawler agent task is completed in that all linked
documents from the first address have been retrieved, or a
meta-search agent task is completed in that all documents
from an executed search engine have been retrieved, the
agent leader terminates the agent, thus freeing computer
resources to allow the agent server to create new agents.

The agent server 22 at steps 40, 42, and 44 determines
whether the documents retrieved by agents include a
particular media type when such media type was selected by the
user in the user profile, for example, if only graphics was
selected. Different media types are recognizable by being in
a different format, code, or tag when received in the HTML
code representing the document. When a particular media
type is selected by the user and a retrieved document does
not include that media type, the document is not processed
by the system except for identifying any further hyperlink
addresses for crawling by crawler agents.
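
The conversion of a natural language processed query into an engine-specific search query, and the recognition of URL addresses in a results page, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `format_search_query` and `extract_links`, the `param` and `joiner` arguments, and the use of Python's standard `html.parser` are illustrative choices, not part of the described system.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus

def format_search_query(categories, terms, param="query", joiner="+"):
    """Join subject categories and important terms with the engine's AND symbol.

    `param` and `joiner` are hypothetical per-engine settings; a real engine
    may need a different parameter name or boolean syntax.
    """
    parts = [quote_plus(p) for p in list(categories) + list(terms)]
    return f"{param}={joiner.join(parts)}"

class _LinkExtractor(HTMLParser):
    """Collects the href value of every anchor tag in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return the URL addresses found in anchor tags, in page order."""
    parser = _LinkExtractor()
    parser.feed(html)
    return parser.links

# The example from the description: subject category "planet", term "Earth".
search_query = format_search_query(["planet"], ["Earth"])
```

With the defaults above, `search_query` is the string "query=planet+Earth" used as the example in the description.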

As each crawler agent and meta-search agent retrieves a
document, the agent server sends the document to the natural
language processor 24 to obtain the subject categories and
key terms of the text of the document, and filters this
information through the agent's embedded neural network
(step 45). This is achieved by setting any of the input
neurons of the artificial neural network associated with
subject categories or important terms with a real number
based on the frequency of the subject categories or important
terms occurring in the natural language processed document,
such that the value from the output neuron represents the
retrieval value for the document. As described earlier, the
real value number at the input of the artificial neural network
represents the number of times the subject category or
important term associated with that input appeared in the
natural language processed text divided by the total number
of times all subject categories and important words of all
inputs appeared in the natural language processed text of the
document.
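
The initial training of step 36 and the scoring of step 45 can be sketched with a small back-propagation network. This is a sketch under assumptions, not the described implementation: the class name `TinyNet`, the sigmoid activation, the bias terms, the learning rate, the epoch count, and the random seed are all illustrative choices that the description does not specify.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def initial_patterns(n_inputs):
    """The two patterns of step 36: all inputs 1 -> 1, all inputs 0 -> 0."""
    return [([1.0] * n_inputs, 1.0), ([0.0] * n_inputs, 0.0)]

class TinyNet:
    """One input per term, a hidden layer of equal size, one output neuron."""
    def __init__(self, terms, seed=0):
        rng = random.Random(seed)
        self.terms = list(terms)
        n = len(self.terms)
        # The last weight in each row is a bias weight (an assumption).
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n + 1)]
                         for _ in range(n)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n + 1)]

    def forward(self, x):
        xs = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xs)))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train(self, patterns, epochs=5000, rate=0.5):
        """Plain online back-propagation on squared error."""
        for _ in range(epochs):
            for x, target in patterns:
                hidden, out = self.forward(x)
                xs, hs = list(x) + [1.0], hidden + [1.0]
                delta_out = (target - out) * out * (1.0 - out)
                # Hidden deltas use the output weights before they change.
                delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                                for j, h in enumerate(hidden)]
                for j, h in enumerate(hs):
                    self.w_out[j] += rate * delta_out * h
                for j, row in enumerate(self.w_hidden):
                    for i, v in enumerate(xs):
                        row[i] += rate * delta_hidden[j] * v

    def retrieval_value(self, counts):
        """Score a document from its term counts (step 45).

        Each input is the count of its term divided by the total count
        of all input terms in the document.
        """
        total = sum(counts.get(t, 0) for t in self.terms)
        x = [counts.get(t, 0) / total if total else 0.0 for t in self.terms]
        return self.forward(x)[1]

net = TinyNet(["planet", "Earth"])
net.train(initial_patterns(2))
value = net.retrieval_value({"planet": 3, "Earth": 1})  # inputs 0.75 and 0.25
```

A document containing none of the input terms is scored here with all inputs at zero, which is the "absence" pattern used in training.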

The agent server 22 displays to the user through the GUI 21
each of the documents from the search having a retrieval
value above a threshold, such as 0.3 (step 46). These are
called matches. The results are outputted, for example, at
window 78 in FIG. 3B. The agent leader ranks the matches by
their retrieval value in window 78, and continuously updates
the rank as new documents are retrieved by agents. Each entry
on the list of documents in window 78 represents the address
(URL) of a document. However, additional information may
be provided, such as the documents' retrieval values. The
area of each document address on the GUI 21 represents a
hyperlink to the Web site, which may be double-clicked
upon by the user to review the document.

All results are also stored in the database in terms of the 
address of the document, its retrieval value, and a relevancy 
bit, which may be set as described below. Due to the large 
number of documents which may be retrieved, the agent 
server 22 may retain only a certain number of documents in 
the search results of the database, such as 100 or 200, having 
the highest retrieval values. As stated earlier, an identifier 
may be assigned to the document in the database to link the 
document to a record in the Processed Document Table 
storing the results from the natural language processor for 
the document. 
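
The thresholding, ranking, and retention of search results described above can be sketched as follows. The names `SearchResult`, `matches`, and `retain_top` are hypothetical; the 0.3 threshold and the limit of 100 are the example values given in the description.

```python
import heapq
from dataclasses import dataclass

@dataclass
class SearchResult:
    address: str            # URL of the document
    retrieval_value: float  # output of the agent's neural network
    relevant: int = 0       # relevancy bit, "1" once selected as relevant

def matches(results, threshold=0.3):
    """Documents above the threshold, ranked from highest to lowest value."""
    above = [r for r in results if r.retrieval_value > threshold]
    return sorted(above, key=lambda r: r.retrieval_value, reverse=True)

def retain_top(results, limit=100):
    """Keep only the `limit` results having the highest retrieval values."""
    return heapq.nlargest(limit, results, key=lambda r: r.retrieval_value)

found = [
    SearchResult("http://example.com/a", 0.82),
    SearchResult("http://example.com/b", 0.12),
    SearchResult("http://example.com/c", 0.47),
]
ranked = matches(found)
```

Calling `matches` again whenever a new document arrives would correspond to the continuous re-ranking in window 78.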

As indicated by step 48, the crawler agents continue to 
search Web sites and retrieve documents. The meta-search 
agents also continue to retrieve the documents appearing in
the results page(s) of their respective search engines;
however, their operation will eventually cease when all such
documents from the results page(s) have been retrieved.
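
The continued crawling from starting addresses through a shared queue of link addresses can be sketched as below. This is a simplified single-threaded sketch: the function name `crawl`, the injected `fetch_links` callable, the `max_documents` cap, and the `seen` set used to avoid retrieving an address twice are assumptions, not details given in the description.

```python
from collections import deque

def crawl(start_addresses, fetch_links, max_documents=1000):
    """Retrieve documents breadth-first from the starting addresses.

    `fetch_links(address)` stands in for retrieving a document and
    returning the hyperlink addresses found in it.
    """
    queue = deque(dict.fromkeys(start_addresses))  # queue held in memory
    seen = set(queue)
    retrieved = []
    while queue and len(retrieved) < max_documents:
        address = queue.popleft()
        retrieved.append(address)
        for link in fetch_links(address):
            if link not in seen:      # assumption: skip addresses already queued
                seen.add(link)
                queue.append(link)
    return retrieved

# A toy link graph standing in for Web sites.
web = {"a": ["b", "c"], "b": ["c", "d"], "c": [], "d": ["a"]}
order = crawl(["a"], lambda address: web.get(address, []))
```

In the described system several crawler agents would draw from the queue concurrently; the loop above serializes that behavior for clarity.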

The user at step 50 has the option of selecting the most 
relevant documents on the display. Such documents repre- 
sent a new training set for the artificial neural network. In the 
example of FIG. 3B, the user may single-click upon the area 
of a document address in window 78, and then on button 80 
to indicate that the document is relevant to the query by 
adding the document to the training set. Alternatively, a 
radio button or check box may be displayed adjacent each 
listed document to enable the user to select a document as 
relevant, and further enable the user to change the documents
considered relevant. When a document is selected by
the user as relevant, the agent server, via the database server,
records this in the database by setting the relevancy bit
associated with the document to "1", otherwise the relevancy
bit is "0". If the user changes a document from
relevant to non-relevant, the agent server changes the relevancy
bit of the document accordingly. The user can select
which documents are relevant at any time during the search.
Periodically, the agent server 22 retrains (or trains) the
artificial neural network common to all agents under an
agent leader in accordance with the training set of relevant
documents (step 54). The interval between training sessions
may be a parameter set by the user. For example, the interval
may be 15 minutes. When training is about to occur, if the
user has not selected any relevant documents, the agent
server 22 automatically performs relevance feedback at step
52 by considering the top X number of documents having the
highest retrieval value as relevant and includes such
documents in the training set by setting their relevancy bits
in the database to "1". For example, X may equal 10; however,
other numbers may be used. If the user has selected fewer than
X number of documents as relevant, the automatic relevance
feedback may be performed to supplement the number of
documents in the training set until X documents are present.
Similar to the user selected relevance, the documents which
are considered relevant are indicated in window 78. The user
at step 50 may later add or change the relevancy status of any
document whether automatically or manually selected as
relevant. The agent leader can change a document from
relevant to non-relevant by changing the relevancy bit, but
it cannot affect the relevancy bit of a document once selected
relevant by a user at step 50. Memory 19 stores a list of any
documents selected relevant by the user by the document's
address, such that such documents are excluded from any
future automatic relevance determination at step 52. The
agent leader records in memory 19 a list of the documents
automatically determined relevant by their address. Thus,
for example, a user may wait an hour after a search
commences until performing user relevance feedback, such that
six artificial neural network training sessions would
occur.
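
The automatic relevance feedback of step 52 can be sketched as a function that supplements the user's selections up to X documents. The function name and argument layout are hypothetical; the rule that user selected documents are never changed by the automatic step, and the example X of 10, follow the description.

```python
def auto_relevance_feedback(results, user_selected, x=10):
    """Return the addresses whose relevancy bit should be "1".

    results:       dict mapping document address -> retrieval value
    user_selected: set of addresses the user marked relevant (never altered)
    x:             target number of relevant documents
    """
    relevant = set(user_selected)
    needed = x - len(relevant)
    if needed > 0:
        # Rank only documents the user has not selected, highest value first.
        candidates = sorted(
            (address for address in results if address not in user_selected),
            key=results.get,
            reverse=True,
        )
        relevant.update(candidates[:needed])
    return relevant

scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.4}
training_addresses = auto_relevance_feedback(scores, user_selected={"d"}, x=3)
```

Here the user's choice "d" is kept despite its low retrieval value, and the two highest scoring remaining documents fill the set to three.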

Referring to FIG. 4, the retraining of the artificial neural
network at step 54 is described in more detail. First, a
training set of relevant documents is established in memory
19 by retrieving any documents stored in the search results
of the database for the user profile having a relevancy bit of
"1". The agent server adds to each document in the training
set their subject categories and important terms as stored in
the records of the Processed Document Table (step 82). Next,
the agent server determines the frequency, in terms of the
number of documents of the training set, with which each of
the subject categories and important terms occurs in the
training set, and ranks the subject categories and terms from
most to least frequent in documents (step 84). This may be
achieved by statistically counting the number of documents
of the training set in which each different subject category
or important term occurs. The subject categories and
important terms which occur in at least half of all documents
are then selected (step 86). The natural language processed
query is expanded to include the selected subject categories
and terms (step 87). The database is modified by the agent
server, via the database server, to add the selected subject
categories and terms to the stored natural language processed
query.
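
Steps 84 through 87 amount to a document-frequency count followed by a selection at the one-half cutoff. A minimal sketch follows, assuming each training document is represented simply as a list of its subject categories and important terms; the function name `expand_query` is hypothetical.

```python
from collections import Counter

def expand_query(query_terms, training_docs):
    """Add terms occurring in at least half of the training documents.

    query_terms:   list of categories/terms in the current processed query
    training_docs: list of term lists, one per relevant document
    """
    doc_freq = Counter()
    for terms in training_docs:
        doc_freq.update(set(terms))          # count each term once per document
    ranked = [term for term, _ in doc_freq.most_common()]   # step 84
    selected = [t for t in ranked
                if doc_freq[t] * 2 >= len(training_docs)]   # step 86
    return list(query_terms) + [t for t in selected
                                if t not in query_terms]    # step 87

docs = [
    ["planet", "orbit", "Earth"],
    ["orbit", "moon"],
    ["Earth", "orbit"],
    ["sun"],
]
expanded = expand_query(["planet", "Earth"], docs)
```

In this example "orbit" occurs in three of four documents and is added, while "moon" and "sun" occur in only one and are not.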
While the agents continue to use the current artificial
neural network, the agent server at steps 88, 90 and 92
modifies and retrains the artificial neural network, which when
complete, will replace the artificial neural network embedded
in each agent, thereby enabling the agent to learn. At
step 88, an input node is added to the existing artificial
neural network for each of the selected subject categories
and terms. At step 90, training patterns are generated based
on the documents in the training set. For each document in
the training set, an input pattern is generated which should
lead to an output of "1", i.e., a relevant document, from the
neural network, such that inputs of the neural network
associated with subject categories or important terms have a
real value number between 0 and 1 based on the frequency
of the occurrence of the subject categories and important
terms of the document, as described earlier. Using the same
training technique described earlier in connection with step
36, the neural network in accordance with the expanded
query is trained based on the training set of step 90. The
training may be considered retraining in which the current
weights are used, or training in which all the weights of the
artificial neural network are determined. The trained neural
network then replaces the embedded artificial neural network
of each agent under the agent leader, and is stored in
the neural network information in the database by the agent
server, via the database server (step 94). The user may
manually instruct the agent leader to perform step 54
through the GUI 21, such as by clicking on a retrain button
62) (FIG. 3B).

After retraining of the artificial neural network of each
agent based on the training set of relevant documents is
complete, the agent server 22 checks if it is time to evolve
the embedded neural network of each agent at step 56 of
FIG. 2B. If so, the agent server will evolve the neural
network based on the user or automatic relevancy feedback
indicated by the relevancy bits in the search results stored in
the database (step 58). The evolution time may be a clock
time set by the user via the GUI when the user profile was
entered, or may be on a periodic interval. For example, if a
search commenced at 9 PM, the user may select the evolution
time at 1 AM each day, or the evolution may occur
periodically at other intervals.

Referring to FIG. 5, the evolving of the artificial neural
network at step 58 is described. The agent server 22 first
obtains the documents stored as search results in the database
through the database server and temporarily stores
them in memory 19 by their address with their subject
categories and important terms (step 96). A majority of the
documents are allocated as a training document set and the
remaining a test document set (step 98). For example, in the
case where the search results stored a hundred retrieved
documents, ninety would represent the training document
set and ten the test document set. The agent server 22 then
generates a number of first generation of agents (step 100),
and provides each such agent with a different neural network
having a different subset of one or more of the inputs (or
features) of the current neural network embedded in crawler
and meta-search agents of steps 40-42 (step 102). The
subject categories and important terms used as inputs for
each first generation of agent are randomly selected using
typical random number techniques in which each input has
an equal probability of occurrence. The number of agents in
each generation may be twenty, however, other number of
agents may also be used. The artificial neural network is
structurally the same as the artificial neural network
described earlier, except each network has a different set of
inputs.

For each first generation of agent, a series of training
patterns for its artificial neural network are generated based
on the training document set (step 104). This is identical to
the generation of training patterns described earlier at step
90, except the desired output is "1" if the relevancy bit for
a document is set to "1" or "0" if the relevancy bit is set to
"0". Using the same training technique described earlier in
connection with step 36, each of the first generation agent's
neural networks is trained based on their respective training
set (step 106). After the first generation agents are trained,
the agent server applies each agent to each of the documents
in the test document set to determine how many documents
each agent correctly classifies as relevant and non-relevant,
as determined by the relevancy bit of the document (step
108). A document is correctly determined relevant if its value
was above 0.5 and the relevancy bit for the document was
"1". A document is correctly determined non-relevant if
its value was below 0.5 and the relevancy bit for the
document was "0". For each of the first generation of agents,
a fitness function is determined, defined by the ratio of the
number of documents correctly classified to the number of
documents in the test document set. The agents are
then ranked by their fitness function from best to worst
classifiers (step 110). Next, the top M number of agents are
identified as ranked by their fitness function, for example, M
equal two (step 112). The agent server then checks if a
maximum number of generations have been produced (step
113). For example, the maximum number may be twenty
generations, but other number of generations may be used.
If the maximum number of generations has been reached,
the yes branch is taken to step 116. At step 116, the evolution
of agents is complete, and the agent server replaces the
artificial neural network used to search the WWW by each
of the crawler and meta-search agents with the evolved
artificial neural network of the top ranked agent of the last
generation. Information on the evolved neural network
replaces the information of the old neural network in the
database by the agent server, via the database server.
Furthermore, the natural language processed query is
revised to include the subject categories and important terms
associated with the input neurons of the evolved artificial
neural network.

If the maximum number of generations has not been
reached, the no branch is taken to step 114. At step 114, a
second generation of agents is reproduced each having a
subset of neural network inputs (or features) of one or more
of the inputs of the current neural network embedded in
crawler and meta-search agents, where the inputs (subject
categories or terms) of the higher ranked agents have a higher
probability of occurring in agents of the next generation. The
probability that an agent will be a parent to the next
generation is shown in the following equation:

$$p\,(1-p)^{n-1}$$

where p is the probability that the highest ranked agent
will be selected, which for example is 0.6, and n is the
agent's rank from step 110. Thus, the top ranked agent
has a probability of 0.6, the next ranked agent has a
probability of 0.24, the next ranked agent has a
probability of 0.096, and so forth for each subsequently
ranked agent. To select each agent, a random number
generator outputs a real number value between 0 and 1,
such that if this value is between 0 and 0.6 the top
ranked agent is selected, between 0.6 and 0.84
(0.6+0.24) the next ranked agent is selected, between 0.84
and 0.936 (0.6+0.24+0.096) the next ranked agent is
selected, and so forth for each subsequent ranked agent.
The inputs of the artificial neural network of the two
selected agents determine the inputs of the new agent in
which half of the inputs are randomly selected from the
first selected agent and half the inputs are randomly
selected from the second selected agent. This is
repeated for each agent of the second generation until
the total number of agents of this generation equals the
number of the previous generation plus the top M
agents of the previous generation. The second generation
of agents includes the top M agents of the prior
generation.
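
The rank-based parent selection and the mixing of parent inputs at step 114 can be sketched as follows. The function names are hypothetical. Two details are assumptions not settled by the description: any random value falling beyond the cumulative probability of the last rank is assigned to the last ranked agent, and an odd number of parent inputs is rounded up when taking "half".

```python
import random

def parent_probability(rank, p=0.6):
    """Probability p(1-p)^(n-1) that the agent of 1-based rank n is selected."""
    return p * (1.0 - p) ** (rank - 1)

def select_rank(u, n_agents, p=0.6):
    """Map a random value u in [0, 1) to a rank using cumulative probabilities."""
    cumulative = 0.0
    for rank in range(1, n_agents + 1):
        cumulative += parent_probability(rank, p)
        if u < cumulative:
            return rank
    return n_agents  # assumption: the leftover tail goes to the last rank

def crossover(inputs_a, inputs_b, rng):
    """Child inputs: a random half of each parent's inputs."""
    half_a = rng.sample(sorted(inputs_a), (len(inputs_a) + 1) // 2)
    half_b = rng.sample(sorted(inputs_b), (len(inputs_b) + 1) // 2)
    return set(half_a) | set(half_b)

rng = random.Random(1)
first = select_rank(rng.random(), n_agents=20)
second = select_rank(rng.random(), n_agents=20)
child = crossover({"planet", "Earth", "orbit", "moon"}, {"sun", "Earth"}, rng)
```

With p equal to 0.6, values of u of 0.5, 0.7, and 0.9 fall in the intervals given in the description and select ranks 1, 2, and 3 respectively.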

The agent server 22 then checks if the next generation of
agents matches the last generation of agents stored in
memory 19 to determine if the generations have stabilized
(step 115). This may be determined by all the agents of a
generation being the same, or the highest ranked agent of two
successive generations having the same (or approximately the
same) fitness value, or the average fitness value of two
successive generations of agents being equal (or approximately
equal). If so, the generations have stabilized, and the
branch is taken to step 116; otherwise, the training pattern
for this generation of agents is defined at step 104 and steps
106-114 are repeated until either of the conditions of steps
113 or 115 is satisfied. The resulting evolved neural network
should more accurately determine when documents are
relevant.
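
The allocation of documents at step 98, the fitness measure of steps 108 to 110, and one of the stabilization tests can be sketched as below. The function names, the slicing used for the split, and the tolerance value are assumptions; the 0.5 classification cutoff and the ninety to ten split come from the description.

```python
def split_documents(documents, train_fraction=0.9):
    """Allocate a majority of documents for training, the rest for testing."""
    n_train = round(len(documents) * train_fraction)
    return documents[:n_train], documents[n_train:]

def fitness(values, relevancy_bits):
    """Ratio of test documents an agent classifies correctly.

    A document counts as correct if its value is above 0.5 with bit 1,
    or below 0.5 with bit 0.
    """
    correct = sum(
        1 for value, bit in zip(values, relevancy_bits)
        if (value > 0.5 and bit == 1) or (value < 0.5 and bit == 0)
    )
    return correct / len(relevancy_bits)

def stabilized(previous_fitness, current_fitness, tolerance=1e-3):
    """True if the best fitness of two successive generations is about equal."""
    return abs(max(previous_fitness) - max(current_fitness)) <= tolerance

train_set, test_set = split_documents(list(range(100)))
score = fitness([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1])
```

In the example, two of the four test documents are classified correctly, so the fitness is 0.5.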

After a new neural network has evolved and has been 
embedded in the present agents, the agents continue to 
search the WWW, and the agent leader branches to step 54 
to expand the neural network based on the automatic or user
relevance feedback, as described earlier. The agents under 
the agent leader continue to search the WWW until the user 
stops the search at step 59. A stop agent button 62d (FIG. 3B) 
on the GUI 21 may be selected by the user to stop the search 
of an agent leader selected in box 63. The continue agent 
button 62e may then be selected by the user to continue the 
search. When a search is stopped, its associated record in the 
Agent Leader Table is maintained. Data defining the present 
status of the search in terms of the present address of each 
crawler agent on the WWW, the contents of the document to 
be received queue, and any addresses in memory 19 pro- 
vided from search engines not yet retrieved by meta-search 
agents, is also stored in a database linked to the name of the 
agent leader. The user may later load the data for the agent 
leader by selecting the agent leader name listed in box 63, 
and clicking on the load agent button 62f to instruct the agent
server to load the saved data for a search from the database 
27 into memory 19. The settings button 62b may be clicked 
to direct the agent server 22 to display the user profile 
information in page 64 for the agent leader, and then the start 
agent button 62g to direct the agent server 22 to start the 
search. The user may remove an agent leader from the
database by clicking on the remove button 62h,
which directs the agent server to remove the files associated
with the agent leader in the database. FIGS. 3A and 3B show
an example of the GUI 21. Other pages of a GUI may be used
with different fields and buttons to enable a user to interface 
with system 8 of the present invention. 

The database may maintain a log of the events occurring 
during a search of an agent leader. The log may record, for 
example, each of the query expansions at step 54 and the 
state of the query after each evolution. The user may click 
on the show log button 62/ of the GUI 21 of FIG. 3A to 
instruct the agent server 22 to display the contents of the log 
from database 27 through the GUI. 

Multiple searches may run at the same time by defining 
multiple user profiles. This is shown for example in FIG. 3B 
in which the results of a search of another agent leader are 
provided in box 79fl. The status of each search is shown by 
its agent leader name in the GUI 21, such as in box 61. 






Although this description refers to the WWW, computer 
system 10 may be used for searching one or more databases 
accessible by computer system 10 on CD-ROM, hard-disk,
modem or LAN, in which the documents stored in the 
database have text and may be retrieved in accordance with 
a query. 

From the foregoing description, it will be apparent that an 
improved system for retrieving multimedia information 
from the Internet using multiple evolving intelligent agents 
has been provided. Variations and modifications of the 
herein described system and other applications for the 
invention will undoubtedly suggest themselves to those 
skilled in the art. Accordingly, the foregoing description 
should be taken as illustrative and not in a limiting sense. 

What is claimed is: 

1. A system for retrieving information on a computer 
coupled to a computer-based network, such as the Internet, 
in accordance with a query representing the information a 
user wishes to retrieve, said system comprising: 

means for producing a plurality of first agents and second
agents in which said first agents each retrieve documents
at a different first network address and at other
addresses linked from the document at the first network
address, and said second agents each execute a search
on a different search engine via the network in accordance
with said query and retrieve documents at
network addresses provided by the executed search;

said first and second agents each comprising an artificial 
neural network trained in accordance with said query 
for determining for each of the retrieved documents by 
said agents a retrieval value representing the proximity 
of the content of the retrieved documents to said query; 
and 

means for displaying to the user the addresses of the 
retrieved documents above a threshold retrieval value. 

2. The system according to claim 1 further comprising 
means for enabling said user to input said query. 

3. The system according to claim 1 further comprising a 
natural language processor for determining subject catego- 
ries and terms representative of said query, and means for 
generating and training said artificial neural network having 
inputs in accordance with said subject categories and terms 
representative of said query. 

4. The system according to claim 3 wherein said natural 
language processor operates on each of the retrieved docu- 
ments from said first and second agents to determine the 
subject categories and terms representative of the retrieved 
document, and each of said first and second agents set the 
inputs of the artificial neural network of the agent in accor- 
dance with the subject categories and terms representative of 
each of the retrieved documents by the agent to determine 
the retrieval value of the retrieved document. 

5. The system according to claim 1 further comprising
means for selecting which ones of said retrieved documents
are relevant to said query.

6. The system according to claim 1 further comprising 
means for expanding the artificial neural network of said 
first and second agents in accordance with the frequency of 
subject categories and terms present in said selected relevant 
documents, and training said artificial neural network of said 
first and second agents in accordance with training patterns 
based upon said selected relevant documents. 

7. The system according to claim 6 wherein said expand- 
ing and training means is enabled periodically. 

8. The system according to claim 1 further comprising
means for evolving the artificial neural network of said first
and second agents in accordance with said retrieved documents
by said agents and said selected relevant documents,
in which multiple generations of third agents are generated
having artificial neural networks with subcombination of the
input of the artificial neural network of said first and second
agents and each successive generation of third agents have
a higher chance of obtaining inputs of artificial neural
networks of third agents of the previous generation which
performed best at classifying a group of said retrieved
documents as relevant and non-relevant as provided by said
selecting means.

9. The system according to claim 1 further comprising 
means for enabling said user to select one or more retrieved 
documents on said displaying means as relevant. 

10. The system according to claim 1 further comprising 
means for automatically selecting the relevant retrieved 
documents. 

11. The system according to claim 1 further comprising: 
a Web browser; and 

a graphical user interface enabled through the Web 
browser for said user to input said query and informa- 
tion characterizing the type of documents to be 
retrieved, wherein said query and said information 
represent a user search profile. 

12. The system according to claim 11 further comprising 
an agent server for receiving said user search profile and 
generating an agent leader in accordance with said user 
search profile responsible for enabling said first and second 
agent producing means. 

13. The system according to claim 12 wherein said agent 
server responsive to receiving multiple different ones of user 
search profiles generates multiple different agent leaders in 
accordance with each of said user search profiles, wherein 
each of the agent leaders is responsible for enabling said 
producing means to provide a different group of said first 
and second agents under each of the agent leaders. 

14. The system according to claim 11 further comprising 
a database and a database server for storing at least said user 
profile, information representing said artificial neural net- 
work of said first and second agents, and results of the 
retrieved documents in terms of at least their network 
addresses. 

15. The system according to claim 1 further comprising 
means for generating and training an artificial neural net- 
work common to each of said first and second agents in 
accordance with said query. 

16. The system according to claim 1 further comprising 
means for enabling said user to select at least one of said first 
network addresses. 

17. The system according to claim 1 wherein at least one 
of said addresses of documents retrieved by said second type 
of agents provides one of said first network addresses. 

18. A method for retrieving information on a computer 
coupled to a computer-based network, such as the Internet, 
in accordance with a query representing the information a 
user wishes to retrieve, said method comprising the steps of: 

producing a plurality of first agents and second agents in 
which said first agents each retrieve documents at a 
different first network address and at other addresses 
linked from the document at the first network address, 
and said second agents each execute a search on a 
different search engine via the network in accordance 
with said query and retrieve documents at network 
addresses provided by the executed search; 

generating a trained artificial neural network common to 
each of said first and second agents in accordance with 
said query for determining for each of the retrieved 
documents by said agents a retrieval value representing 
the proximity of the content of the retrieved documents 
to said query; and 

displaying to the user the addresses of the retrieved 
documents above a threshold retrieval value. 

19. The method according to claim 18 further comprising 
the steps of: 

selecting which of said displayed addresses of the 
retrieved documents are relevant; and 

periodically revising and training said artificial neural 
network of said first and second agents in accordance 
with said selected retrieved documents. 

20. The method according to claim 18 further comprising 
the step of: 

evolving said artificial neural network of said first and 
second agents in accordance with said retrieved docu- 
ments and said selected retrieved documents. 

21. The method according to claim 18 further comprising 
means for enabling said user to select at least one of said first 
network addresses. 

22. The method according to claim 18 wherein at least one 
of said addresses of documents retrieved by said second type 
of agents provides one of said first network addresses. 

23. A system for retrieving information from the Internet 
utilizing multiple intelligent agents comprising: 

a computer system having a graphical user interface to 
input a query, means for accessing the Internet, means 
for producing a plurality of agents in which each of said 
agents retrieves documents at a first address on the 
Internet and at other addresses linked to the document 
at the first address, and means for determining the 
subject and important terms of the text of the query and 
of each of the documents retrieved; 

each of said agents having a common neural network for 
determining the relevancy of each of the documents 
retrieved by the agent, said neural network having a 
plurality of inputs and an output in which said inputs 
are based upon the subject and important terms of the 
query and said output representing a relevance value of 
each of the documents applied to the neural network; 

said computer system having means for training the 
neural network in accordance with the query, means for 
selecting which of said retrieved documents are 
relevant, means for periodically updating the neural 
network with additional inputs based on the subject and 
important terms of the selected retrieved documents 
and training the updated neural network using said 
selected retrieved documents to provide a retrained 
neural network for each agent, and means for evolving 
and training a plurality of different neural networks in 
which each is based on a subset of the inputs of the 
retrained neural network, and iteratively evolving and 
training a new set of different neural networks having 
a subset of the inputs of such evolved trained different 
neural networks which best classifies documents as 
relevant until one of the evolved neural networks is the 
best classifier of documents as relevant to provide said 
one neural network for each agent; and 

said graphic user interface displaying to a user the results 
of the documents retrieved by said first and second 
agents. 

24. A method for neurogenically evolving a parent artificial 
neural network having a plurality of inputs each 
characteristic of a different feature using multiple sets of one 
or more of said training features in which each set has a 
classification of a plurality of known classes, said method 
comprising the steps of: 

generating a plurality of agents each having an artificial 
neural network with a different subset of said features 
of said parent artificial neural network; 

dividing said multiple sets into a training group and a test 
group; 

training the artificial neural network of each of said agents 
with said multiple sets of said training group; 

testing the artificial neural network of each of said agents 
using said multiple sets of said test group to determine 
the number of sets correctly classified; 

determining a fitness function for each of said agents by 
dividing the number of sets correctly classified by the total 
number of sets in said test group; 

ranking the agents by their fitness function; 

generating a plurality of next generation agents each 
having an artificial neural network with a different 
subset of said features of said parent artificial neural 
network in which said artificial neural network of the 
next generation agents has a greater chance of including 
said features of the artificial neural network of said 
higher ranked agents of the prior generation; 

repeating said training step, testing step, determining step, 
identifying step and said step of generating a plurality 
of next generation agents in accordance with said next 
generation of agents until one of a maximum number of 
generations of agents have been produced, and two 
successive generations of agents each having identical 
features are produced, in which the agent having the 
highest fitness function represents an evolved artificial 
neural network. 
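
The steps of claim 24 can be sketched as the loop below. This is a hedged illustration under several assumptions that are not in the claims: a simple perceptron stands in for each agent's artificial neural network, features of the top-ranked half of a generation are given a higher sampling probability in the next generation, subsets are drawn independently (so distinctness of subsets is not enforced), and the best agent seen across all generations is returned. All function names, parameters, and the toy data are hypothetical.

```python
import random

def predict(x, features, weights, bias):
    """Classify one feature set as class 1 or 0 with a perceptron."""
    return 1 if bias + sum(weights[f] * x[f] for f in features) > 0 else 0

def train_perceptron(samples, features, epochs=20):
    """Train a perceptron (stand-in for an agent's network) on the chosen features."""
    weights = {f: 0.0 for f in features}
    bias = 0.0
    for _ in range(epochs):
        for x, label in samples:
            err = label - predict(x, features, weights, bias)
            bias += err
            for f in features:
                weights[f] += err * x[f]
    return weights, bias

def fitness(samples, features, weights, bias):
    """Number of test sets correctly classified divided by the total number of test sets."""
    correct = sum(1 for x, label in samples
                  if predict(x, features, weights, bias) == label)
    return correct / len(samples)

def evolve(parent_features, sets, n_agents=8, max_generations=10, seed=0):
    rng = random.Random(seed)
    sets = sets[:]
    rng.shuffle(sets)
    split = len(sets) // 2  # requires at least two sets
    train, test = sets[:split], sets[split:]

    def random_subset(probs):
        # Draw a non-empty subset of the parent network's features.
        subset = [f for f in parent_features if rng.random() < probs[f]]
        return sorted(subset) or [rng.choice(parent_features)]

    probs = {f: 0.5 for f in parent_features}
    population = [random_subset(probs) for _ in range(n_agents)]
    previous, best = None, None
    for _ in range(max_generations):
        scored = []
        for feats in population:
            weights, bias = train_perceptron(train, feats)
            scored.append((fitness(test, feats, weights, bias), feats))
        scored.sort(key=lambda s: s[0], reverse=True)  # rank agents by fitness
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        generation = sorted(tuple(feats) for _, feats in scored)
        if generation == previous:  # two successive identical generations
            break
        previous = generation
        # Features of higher-ranked agents get a greater chance of reappearing.
        top = scored[: max(1, n_agents // 2)]
        probs = {f: (1 + sum(f in feats for _, feats in top)) / (2 + len(top))
                 for f in parent_features}
        population = [random_subset(probs) for _ in range(n_agents)]
    return best

# Toy data: the class depends only on feature "a" (hypothetical example).
features = ["a", "b", "c"]
data_rng = random.Random(1)
sets = []
for _ in range(40):
    x = {f: float(data_rng.randint(0, 1)) for f in features}
    sets.append((x, int(x["a"])))
best_fitness, best_features = evolve(features, sets)
```

`best_features` holds the feature subset of the highest-fitness agent, which corresponds to the inputs of the evolved network in the claim's terms.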
